CIS 033
Mission College Robotics Class Fall 2019
Final Project
Dave Goeke
December 9, 2019




Make a computing cluster from two Raspberry Pi computers


I have read about people making massive clusters with thousands of Pis, and others with two or four. The CIS-033 class presented the opportunity to see if I could learn to make and test a small cluster. This paper describes what I did, the lessons learned, the problems and workarounds, and finally some programs to test clustering.

The picture below shows two Pi computers in a cluster, connected to each other by wired Ethernet over a crossover cable. The screen shows a test program that computes an approximation of pi running on the two computers.

Raspberry Pi cluster


Install OS on SD Card


I have two old Raspberry Pi 3 Model B V1.2 computers. One has working WiFi. The only SD cards I have are 64GB, which are problematic according to various web sites, including the Pi site. SD Card Formatter and Balena Etcher, programs that format the SD card and copy Raspbian onto it, produced an image that was not bootable. Various experiments on my Mac and Linux systems were equally unsuccessful. Finally I formatted the cards using an Android phone, of all things, which created a 28 GB FAT32 partition that I could put Raspbian onto and boot from.


There is an overwhelming number of sites, documents, and videos on the web explaining how to make a cluster. After several time-consuming experiments I settled on a circa 2012 procedure document from the University of Southampton in the U.K. It seemed appropriately academic, and it explained clearly how to use verification programs to test a cluster once it is built.


I pulled down the latest copy of Raspbian, with a build date of 9/29/2019. The Raspbian installation program has options to update the OS with all the latest maintenance during installation and configuration. This process seemed to take hours, whereas running apt-get update and apt-get upgrade from the command line accomplished the same objective in less than ten minutes.


Networking Issues


One of the Pi boards had a working WiFi interface, giving it two working network interfaces. During maintenance and tests this seemed to confuse Raspbian and the test programs, so I had to “down” one of the interfaces, depending on whether I was testing the cluster or downloading maintenance updates. The documentation is also confusing on how to configure a static IP on the wired Ethernet interface, but I got it working with updates to /etc/dhcpcd.conf, /etc/network/interfaces, and /boot/config.txt. Since there are only two computers in the cluster, I wired them together directly with a crossover cable. More than two Pis would have required a switch.
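As a sanity check on the static addressing described above, here is a minimal Python sketch. It renders a static-address stanza in the syntax that /etc/dhcpcd.conf accepts and confirms that two addresses share a subnet. The addresses and netmask are taken from the session log later in this paper. The helper names, and the choice to leave out router and DNS entries for a two-node crossover link, are assumptions made for illustration; this is not a copy of the files actually used.

```python
import ipaddress


def dhcpcd_static_stanza(interface: str, address: str, netmask: str) -> str:
    """Render a static-address stanza in dhcpcd.conf syntax (hypothetical helper)."""
    iface = ipaddress.ip_interface(f"{address}/{netmask}")
    # dhcpcd expects CIDR notation, e.g. 172.16.0.2/16 rather than a dotted netmask.
    return f"interface {interface}\nstatic ip_address={iface.with_prefixlen}\n"


def same_subnet(addr_a: str, addr_b: str, netmask: str) -> bool:
    """Return True if both addresses fall inside the same network for the netmask."""
    net_a = ipaddress.ip_interface(f"{addr_a}/{netmask}").network
    net_b = ipaddress.ip_interface(f"{addr_b}/{netmask}").network
    return net_a == net_b


if __name__ == "__main__":
    # Master address and netmask as reported by ifconfig in the session log.
    print(dhcpcd_static_stanza("eth0", "172.16.0.2", "255.255.0.0"))
    # Master and worker must be on the same subnet to talk over the crossover cable.
    print(same_subnet("172.16.0.2", "172.16.0.3", "255.255.0.0"))
```

With a crossover cable there is no router between the two boards, so the two static addresses only need to sit in the same subnet for the nodes to reach each other.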



Using MPICH


Clustering seems to be an established practice, and the add-ons consisted mostly of a package called MPICH, an open-source implementation of the Message Passing Interface (MPI). MPI is a standard for message passing in distributed-memory parallel computing applications. The setup also required installation of GNU Fortran. MPICH comes with a suite of test and verification programs to prove the cluster is working.


Configuration and Setup



One of the Pi computers had working WiFi, allowing access over the air using SSH for configuration, setup, and test execution. From there, over the wired interface, it was possible to SSH into the second Pi to do the same setup and configuration work. I also used a serial connection from my laptop, but the screen is small and it did not work very well.


Source Documentation


Details are in the attached instruction pages from the University of Southampton: http://www.southampton.ac.uk/~sjc/raspberrypi/pi_supercomputer_southampton_web.pdf. Cloning a 64GB SD card took a long time, so I just installed and configured everything twice for the two Pi computers.


Installation Verification Program (IVP)


The installation verification program I used, cpi, came with version 2 of MPICH. It computes an approximation of pi; in the session log below the result agrees with pi to about nine digits (an error of roughly 8.3e-10). With both the wired and wireless interfaces up, the devices did not communicate. I had to “down” the wireless interface for communication over the wire to work.
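To illustrate the kind of calculation such a verification program spreads across the cluster, here is a minimal single-machine Python sketch. It assumes the usual approach of integrating 4/(1+x^2) from 0 to 1 with the midpoint rule, with each process taking every num_procs-th strip and the partial sums then being added together. The function names, the default of 10,000 strips, and the simulation of the two processes in a loop are all assumptions made for illustration; the real program is written in C and exchanges its partial sums through MPI.

```python
import math


def partial_sum(rank: int, num_procs: int, n: int) -> float:
    """Midpoint-rule contribution of one process (hypothetical helper).

    The process with the given rank handles strips rank+1, rank+1+num_procs, ...
    out of n equal-width strips on the interval [0, 1].
    """
    h = 1.0 / n
    total = 0.0
    for i in range(rank + 1, n + 1, num_procs):
        x = h * (i - 0.5)  # midpoint of strip i
        total += 4.0 / (1.0 + x * x)
    return h * total


def estimate_pi(num_procs: int = 2, n: int = 10_000) -> float:
    """Add up the partial sums, standing in for the reduction step of an MPI run."""
    return sum(partial_sum(rank, num_procs, n) for rank in range(num_procs))


if __name__ == "__main__":
    estimate = estimate_pi()
    error = abs(estimate - math.pi)
    print(f"pi is approximately {estimate:.16f}, Error is {error:.16f}")
```

With 10,000 strips the error of this sketch is on the order of 1e-9, the same magnitude as the error reported in the session log. The accuracy depends on the number of strips, not on the number of processes; adding processes only divides the work.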


Session Log


The session log below shows a serial connection to the “Master” Pi in the cluster from a Mac laptop. In it:

The two devices are queried through MPICH
The cpi (compute pi) program is run using MPICH
The wired Ethernet interface is queried, showing an IP address of 172.16.0.2
The worker Pi is pinged at 172.16.0.3


SandyFreBSDUnix:~ daveg$


SandyFreBSDUnix:~ daveg$ screen /dev/cu.usbserial-1410


pi@MASTER:~$


pi@MASTER:~$ mpiexec -f machinefile -n 2 hostname

MASTER
WORKER1


pi@MASTER:~$ mpiexec -f machinefile -n 2 ~/mpich_build/examples/cpi

Process 0 of 2 is on MASTER
Process 1 of 2 is on WORKER1
pi is approximately 3.1415926544231318, Error is 0.0000000008333387
wall clock time = 0.002047


pi@MASTER:~$


pi@MASTER:~$ ifconfig eth0

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 172.16.0.2 netmask 255.255.0.0 broadcast 172.16.255.255
        inet6 fe80::3b3:a83:9c43:5a1c prefixlen 64 scopeid 0x20<link>
        ether b8:27:eb:93:7d:ed txqueuelen 1000 (Ethernet)
        RX packets 384 bytes 34441 (33.6 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 333 bytes 44838 (43.7 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0


pi@MASTER:~$ ping 172.16.0.3

PING 172.16.0.3 (172.16.0.3) 56(84) bytes of data.
64 bytes from 172.16.0.3: icmp_seq=1 ttl=64 time=0.583 ms
64 bytes from 172.16.0.3: icmp_seq=2 ttl=64 time=0.488 ms
64 bytes from 172.16.0.3: icmp_seq=3 ttl=64 time=0.441 ms
64 bytes from 172.16.0.3: icmp_seq=4 ttl=64 time=0.481 ms

^C

--- 172.16.0.3 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 122ms
rtt min/avg/max/mdev = 0.441/0.498/0.583/0.054 ms


pi@MASTER:~$